There are 5343 vertebrate fauna datasets in Neotoma. Below is a map of all the sites in Neotoma corresponding to these datasets, colored by the constituent database to which they correspond.
The datasets associated with the insect database, or with no database (i.e., NA), are:
Of the 5343 vertebrate fauna datasets in Neotoma, a plurality (2346 datasets) have only a chronology in raw radiocarbon years. Another 1197 have both a raw radiocarbon chronology and a calibrated chronology (almost all of these are “Calibrated Radiocarbon Years”; three are “Calendar Years BP”).
To determine temporal coverage of the vertebrate fauna datasets, I had to select chronologies for each dataset. If a dataset only had a single chronology, I chose that. For the rest, this was the order of preference:
For the remaining 23 datasets, I chose a chronology randomly. I used the reliableagespan values associated with this API to get the older and younger bounds for each dataset. Four of the FAUNMAP 1.1 datasets (IDs = 6450, 6861, 6972, 7833) had NA older and younger bounds, so I removed those.
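The selection steps above can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the code used for this analysis: the function names, the dict layout, and especially the order in `EXAMPLE_PREFERENCE` are placeholders for illustration, and the real order of preference should be substituted.

```python
import random

# Placeholder order only: NOT the order of preference used in this analysis.
EXAMPLE_PREFERENCE = [
    "Calibrated radiocarbon years BP",
    "Calendar years BP",
    "Radiocarbon years BP",
]

def choose_chronology(chronologies, preference, rng=None):
    """Pick one chronology (a dict with an 'agetype' key) for a dataset."""
    if not chronologies:
        return None
    # A dataset with a single chronology keeps it.
    if len(chronologies) == 1:
        return chronologies[0]
    rng = rng or random.Random()
    # Otherwise take the first age type in the preference list that is present.
    # Assumption: ties within one age type are broken at random.
    for agetype in preference:
        matches = [c for c in chronologies if c["agetype"] == agetype]
        if matches:
            return rng.choice(matches)
    # Nothing in the preference list applies: choose at random.
    return rng.choice(chronologies)

def has_bounds(chronology):
    """True when both the older and younger bounds are present (not NA)."""
    return (chronology.get("older") is not None
            and chronology.get("younger") is not None)
```

Datasets whose chosen chronology fails `has_bounds` would then be dropped, as was done for the four FAUNMAP 1.1 datasets.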
Below is a table of the ages of the chronologies I chose.
Given these chronologies, I found that most datasets (n = 3246) are from the Holocene epoch, nearly 1000 span the Holocene and Pleistocene, and nearly another 1000 are within the Pleistocene, with fewer further back in time.
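One way to bin a dataset's age span into these groups is sketched below. The boundary values are assumptions (roughly 11,700 years for the start of the Holocene and 2.58 million years for the start of the Pleistocene), and the function name and labels are made up for illustration. The counts above may have been produced with different cutoffs.

```python
# Assumed epoch boundaries in years BP; adjust to whatever the analysis uses.
HOLOCENE_START_BP = 11_700
PLEISTOCENE_START_BP = 2_580_000

def classify_span(older_bp, younger_bp):
    """Label a dataset by the epoch(s) its older and younger bounds fall in."""
    if older_bp <= HOLOCENE_START_BP:
        return "Holocene"
    if older_bp <= PLEISTOCENE_START_BP:
        if younger_bp >= HOLOCENE_START_BP:
            return "Pleistocene"
        return "Holocene and Pleistocene"
    # The older bound predates the Pleistocene.
    return "extends before the Pleistocene"
```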
This graph shows dataset-level temporal precision on a log scale. For datasets with an age type of calendar years AD/BC, I converted to BP. A few datasets purport to be in years BP but actually aren’t; I think those are generally from FID.
And this graph is log-scale temporal precision of sample ages associated with vertebrate fauna datasets.
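The AD/BC-to-BP conversion and the log-scale precision used in these graphs can be sketched as follows. This assumes that "temporal precision" means the width of the age span (older bound minus younger bound) and that AD/BC years are stored as signed numbers (AD positive, BC negative). Both are assumptions, and the function names are hypothetical.

```python
import math

def ad_to_bp(year_ad):
    """Convert a signed calendar year (AD positive, BC negative) to years BP.

    BP counts back from AD 1950. This ignores the missing year zero,
    so BC dates are off by one year.
    """
    return 1950 - year_ad

def log10_span(older_bp, younger_bp):
    """log10 of the age span; None when the span is zero or negative."""
    span = older_bp - younger_bp
    if span <= 0:
        return None  # cannot be shown on a log scale
    return math.log10(span)
```

Spans of zero, where the older and younger bounds are equal, cannot be plotted on a log axis, so the sketch returns `None` for them rather than guessing a value.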
Should I interpret the values below as AD?
Below you can see when a dataset was originally uploaded to Neotoma, based on the recdatecreated field for the dataset.
I harmonized the data from vertebrate fauna datasets in Neotoma with a table that Val and Jessica made, which also describes the taxonomic precision of each identification. That table captured 567 of the 2883 taxa described in Neotoma vertebrate fauna datasets (19.67%), and 32,965 of the 78,415 observations in the datasets (42.04%).
Following harmonization with Jessica and Val’s table, and dropping those taxa which weren’t represented in it, here is a summary of the precision of the vertebrate fauna data overall:
I made an uncertainty metric to measure site-level precision. I assigned an uncertainty score to each rank, such that species = 1, genus = 2, family = 3, and in-between values get half points:
Then I checked at each site how many distinct identifications were made, summed the uncertainty value of those distinct identifications and divided by the total number of distinct identifications to get a site-level uncertainty metric. A value of 1 for a site would then mean that all the distinct identifications at that site were to species-precision. I did exclude all observations for which the taxon-harmonization table isn’t ready (2316 taxa; 45,450 observations).
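The site-level metric described above is the mean score over the distinct identifications at a site. A minimal Python sketch follows, assuming observations arrive as `(site_id, taxon_name, rank)` tuples. The half-point ranks shown (`subgenus`, `subfamily`) are illustrative guesses at the in-between values, and the function name is hypothetical.

```python
from collections import defaultdict

RANK_SCORE = {
    "species": 1.0,
    "genus": 2.0,
    "family": 3.0,
    # Assumed examples of in-between ranks scored with half points.
    "subgenus": 1.5,
    "subfamily": 2.5,
}

def site_uncertainty(observations):
    """Mean uncertainty score of the distinct identifications at each site."""
    distinct = defaultdict(dict)  # site_id -> {taxon_name: score}
    for site_id, taxon, rank in observations:
        if rank not in RANK_SCORE:
            continue  # skip taxa without a harmonized rank
        distinct[site_id][taxon] = RANK_SCORE[rank]
    return {
        site: sum(scores.values()) / len(scores)
        for site, scores in distinct.items()
    }
```

Repeated observations of the same taxon at a site count once, so the metric reflects the list of identifications rather than specimen counts.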
The graph below shows how the uncertainty metric varies with site-level taxonomic richness. The richest sites tend to have more precise identifications.
The map below shows the spatial distribution of taxonomic richness and uncertainty in separate layers.
Below is a graph that shows how precision has evolved over time. There’s no obvious change over time.
Below are maps of two key species from every order of mammals. The following orders are not represented:
I also didn’t consider extinct orders. I think Psittacotherium multifragum may be from an extinct order.
I matched taxa in Neotoma to taxa recorded by the Mammal Diversity Database to get information on order, family, and genus for each identification.
The Mammal Diversity Database won’t match taxa starting with ? or cf., or with sp. in the name, or extinct species. For ?, cf., and sp. or spp., I abstracted that from the Neotoma name to aid with the identifications. (I kept the presence or absence of those markers in a new column though, so it’s not lost, but I haven’t used that information in the below visualizations.)
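Separating the qualifiers from the name while keeping them in their own column can be sketched like this. The function name, the flag names, and the exact patterns are assumptions for illustration. Real Neotoma names may contain other forms that this does not handle.

```python
import re

def split_qualifiers(name):
    """Return (cleaned_name, flags) with ?, cf., sp. and spp. removed."""
    flags = {
        "uncertain": "?" in name,
        "cf": bool(re.search(r"\bcf\.", name)),
        "sp": bool(re.search(r"\bspp?\.", name)),
    }
    cleaned = name.replace("?", " ")
    # Drop the standalone markers cf., sp. and spp.
    cleaned = re.sub(r"\b(cf|spp?)\.", " ", cleaned)
    # Collapse the whitespace left behind.
    cleaned = " ".join(cleaned.split())
    return cleaned, flags
```

The cleaned name is what would be matched against the Mammal Diversity Database, while the flags stay alongside it so the original qualifiers are not lost.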
I ended up matching 52,477 observations out of 78,415, and 1249 out of 2883 taxa. There’s a bias toward missing all the extinct species, so you don’t see the Pilosa or cingulates from the Americas that you might otherwise expect.
For the sake of simplicity, I ignore cf. and sp. in my table below. (But I retain that information in the dataframe.)
The graph below shows how the various mammal orders are represented among the taxa matched to the Mammal Diversity Database (MDD).
And below that, the table shows the number of distinct families, genera, and species represented in Neotoma for each order represented.
The reason that Sirenia has two families but fewer genera and species is that only one of the Sirenia observations went to species-precision, while the other two only went to family, so I didn’t count those two in the genera and species columns. Similarly, proboscideans are not represented in the table but they are in the graph, because there were MDD-matched taxa to order-precision for proboscideans, but nothing more precise than that. Of course, Neotoma has a lot of Mammut and Mammuthus, but those taxon IDs were not matched in the MDD, probably because the MDD seems only to concern itself with extant taxa.